Welcome to the Time and Emotion Minimum Viable Product

We will walk through the steps to parse and aggregate some of the new timesheets that we suggest knowlege workers should use to improve their lot.

BEFORE running the code you need to find the location that this notebook is stored to and then create 2 subfolders: logs and html

Copy the timesheet examples to the logs folder.

Copy the html snippets to the html folder.


In [7]:
import os
import pandas as pd
import webbrowser

DATA_IN = 'logs'
HTML_IN = 'html'
HEADER = 'html/chord_header.html'
FOOTER = 'html/chord_footer.html'
FILE_OUT = 'html/chord_gen.html'

KEY_ERROR = 'Bad Key reported'

START_COL = 8
FINISH_COL = 9
COLUMN_NAMES = ['Priority', 'Description', 'Deadline', 'StartTime', 'EndTime', 'Duration', 'Project', 'Filler_1',
                'StartState', 'FinishState', 'Learn_Apply', 'Attention', 'Source', 'Notes', 'Reflection']

SHEET_NAMES = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday')
FLOW_STATES = ['Annoyed', 'Anxiety', 'Attentive', 'Bored', 'Flow', 'Panic', 'Puzzled', 'Relaxed']

Lets Parse the files

Use the location specified in DATA_IN, above, for the log files


In [8]:
df_total = pd.DataFrame(columns=COLUMN_NAMES)

for root_folder, dirs, files in os.walk(DATA_IN):
    for file_name in files:
        if '~' != file_name[0]:
            xlfile = pd.ExcelFile(os.path.join(DATA_IN, file_name))
            sheet_names = xlfile.sheet_names

            for sheet_name in sheet_names:
                if sheet_name in SHEET_NAMES:
                    df = xlfile.parse(sheet_name)
                    if (len(df.columns)) != 15:
                        print '% s %s %d' % (file_name, sheet_name, len(df.columns))

                    df.columns = COLUMN_NAMES
                    # TODO ensure the datetime columns are the correct format
                    # TODO calculate the timestamps using the filename and appropriate times

                    try:
                        df_total = pd.concat([df_total, df[(df.ix[:, 8].isin(FLOW_STATES) & df.ix[:, 9]
                                                      .isin(FLOW_STATES))]])
                    except KeyError:
                        print '%s %s %s ' % (KEY_ERROR, file_name, sheet_name)

s_temp = df_total.groupby([df_total.FinishState, df_total.StartState], as_index=False).size()

df_chord = s_temp.unstack().fillna(0)

# Add any missing columns
for col_name in FLOW_STATES:
    if col_name not in df_chord.columns:
        df_chord.insert(FLOW_STATES.index(col_name), col_name, 0)

print df_chord


StartState   Annoyed  Anxiety  Attentive  Bored  Flow  Panic  Puzzled  Relaxed
FinishState                                                                   
Annoyed            7        2          6      1     0      0        2        5
Anxiety            0        4          1      1     0      0        3        3
Attentive          1        4         77      1     0      0        6        3
Bored              1        0          4     14     0      0        0        8
Flow               0        0          5      1     3      0        0        8
Panic              0        1          0      0     0      0        0        0
Puzzled            0        2          2      1     0      0        3        0
Relaxed            1        1         21      1     0      0       10       58

Build the chord diagram


In [9]:
s_data = str((df_chord.as_matrix()))

s_data = s_data.replace('.', ',')
s_data = s_data.replace(',]', '],', 7)
s_data = s_data.replace(',]]', ']]')

content = open(HEADER).read()
content += 'var matrix = %s;' % s_data
content += open(FOOTER).read()

fout = open(FILE_OUT, 'w')
fout.write(content)
fout.flush()
fout.close()

path = os.path.abspath(FILE_OUT)
url = 'file://' + path

with open(path, 'w') as f:
    f.write(content)
    webbrowser.open(url)